3,041 research outputs found

    Aspirations as reference points: an experimental investigation of risk behavior over time

    This paper examines the importance of aspirations as reference points in a multi-period decision-making context. After stating their personal aspiration level, 172 individuals made six sequential decisions among risky prospects as part of a choice experiment. The results show that individuals make different risky choices in a multi-period setting than in a single-period setting. In particular, individuals’ aspiration level is their main reference point during the early stages of decision-making, while their starting status (wealth level at the start of the experiment) becomes the central reference point during the later stages of their multi-period decision-making.
    Arvid O. I. Hoffmann; Sam F. Henry; Nikos Kalogera

    Partitioning Strategies for Concurrent Programming

    This work presents four partitioning strategies, or patterns, useful for decomposing a serial application into multiple concurrently executing parts. These partitioning strategies augment the commonly used task and data parallel design patterns by recognizing that applications are spatiotemporal in nature. Therefore, data and instruction decomposition are further distinguished by whether the partitioning is done in the spatial or the temporal dimension. Thus, this work describes four decomposition strategies: spatial data partitioning (SDP), temporal data partitioning (TDP), spatial instruction partitioning (SIP), and temporal instruction partitioning (TIP), while cataloging the benefits and drawbacks of each. In addition, the practical use of these strategies is demonstrated through a case study in which they are applied to implement several different parallelizations of a multicore H.264 encoder for HD video. This case study illustrates both the application of the patterns and their effects on the performance of the encoder.
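
    As a rough illustration of the distinction between spatial and temporal data partitioning, the sketch below splits a per-frame filter either across rows of a single frame (SDP) or across whole frames of a stream (TDP). The Frame type, the filter_rows kernel, and the worker counts are illustrative assumptions, not code from the paper.

```cpp
// Sketch: spatial vs. temporal data partitioning of a per-frame filter.
// All names here (Frame, filter_rows, worker counts) are illustrative.
#include <cstdio>
#include <functional>
#include <thread>
#include <vector>

struct Frame { std::vector<float> pixels; int width, height; };

// Stand-in for per-row work (e.g., one slice of an encoder stage).
static void filter_rows(Frame& f, int row_begin, int row_end) {
    for (int r = row_begin; r < row_end; ++r)
        for (int c = 0; c < f.width; ++c)
            f.pixels[r * f.width + c] *= 0.5f;
}

// Spatial data partitioning (SDP): each worker gets a band of rows of the SAME frame.
static void process_frame_sdp(Frame& f, int num_workers) {
    std::vector<std::thread> workers;
    int band = f.height / num_workers;
    for (int w = 0; w < num_workers; ++w) {
        int begin = w * band;
        int end = (w == num_workers - 1) ? f.height : begin + band;
        workers.emplace_back(filter_rows, std::ref(f), begin, end);
    }
    for (auto& t : workers) t.join();
}

// Temporal data partitioning (TDP): each worker gets WHOLE frames, round-robin in time.
static void process_stream_tdp(std::vector<Frame>& frames, int num_workers) {
    std::vector<std::thread> workers;
    for (int w = 0; w < num_workers; ++w)
        workers.emplace_back([&, w] {
            for (size_t i = w; i < frames.size(); i += num_workers)
                filter_rows(frames[i], 0, frames[i].height);
        });
    for (auto& t : workers) t.join();
}

int main() {
    std::vector<Frame> frames(8, Frame{std::vector<float>(64 * 64, 1.0f), 64, 64});
    process_frame_sdp(frames[0], 4);   // parallelism within one frame
    process_stream_tdp(frames, 4);     // parallelism across frames
    std::puts("done");
}
```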

    Approximation Algorithms for Scheduling with Resource and Precedence Constraints

    We study non-preemptive scheduling problems on identical parallel machines and uniformly related machines under both resource constraints and general precedence constraints between jobs. Our first result is an O(log n)-approximation algorithm for the objective of minimizing the makespan on parallel identical machines under resource and general precedence constraints. We then use this result as a subroutine to obtain an O(log n)-approximation algorithm for the more general objective of minimizing the total weighted completion time on parallel identical machines under both constraints. Finally, we present an O(log m log n)-approximation algorithm for scheduling under these constraints on uniformly related machines. We show that these results can all be generalized to include the case where each job has a release time. This is the first upper bound on the approximability of this class of scheduling problems where both resource and general precedence constraints must be satisfied simultaneously.
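
    The sketch below is only meant to make the problem setting concrete: a greedy list schedule on identical machines that respects precedence constraints, assuming the jobs are given in topological order. It is not the paper's O(log n)-approximation algorithm, and it ignores the resource constraints entirely.

```cpp
// Sketch: a greedy list scheduler for identical machines with precedence
// constraints, to illustrate the problem setting only. It is NOT the paper's
// O(log n)-approximation and it omits the resource constraints.
#include <algorithm>
#include <cstdio>
#include <functional>
#include <queue>
#include <vector>

struct Job { double length; std::vector<int> predecessors; };

// Returns the makespan of a simple list schedule on m identical machines.
// Assumes jobs are listed in a topological order of the precedence DAG.
double list_schedule(const std::vector<Job>& jobs, int m) {
    int n = static_cast<int>(jobs.size());
    std::vector<double> finish(n, 0.0);
    // Min-heap of times at which each machine next becomes free.
    std::priority_queue<double, std::vector<double>, std::greater<double>> machines;
    for (int i = 0; i < m; ++i) machines.push(0.0);

    double makespan = 0.0;
    for (int j = 0; j < n; ++j) {
        double ready = 0.0;  // earliest start allowed by predecessors
        for (int p : jobs[j].predecessors) ready = std::max(ready, finish[p]);
        double machine_free = machines.top(); machines.pop();
        double start = std::max(ready, machine_free);
        finish[j] = start + jobs[j].length;
        machines.push(finish[j]);
        makespan = std::max(makespan, finish[j]);
    }
    return makespan;
}

int main() {
    // Tiny example: job 2 must wait for jobs 0 and 1.
    std::vector<Job> jobs = {{3.0, {}}, {2.0, {}}, {4.0, {0, 1}}};
    std::printf("makespan on 2 machines: %.1f\n", list_schedule(jobs, 2));
}
```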

    Remote Store Programming: Mechanisms and Performance

    This paper presents remote store programming (RSP). This paradigm combines usability and efficiency through the exploitation of a simple hardware mechanism, the remote store, which can easily be added to existing multicores. Remote store programs are marked by fine-grained and one-sided communication, which results in a stream of data flowing from the registers of a sending process to the cache of a destination process. The RSP model and its hardware implementation trade a relatively high store latency for a low load latency because loads are more common than stores, and it is easier to tolerate store latency than load latency. This paper demonstrates the performance advantages of remote store programming by comparing it to both cache-coherent shared memory and direct memory access (DMA) based approaches using the TILEPro64 processor. The paper studies two applications: a two-dimensional Fast Fourier Transform (2D FFT) and an H.264 encoder for high-definition video. For a 2D FFT using 56 cores, RSP is 1.64x faster than DMA and 4.4x faster than shared memory. For an H.264 encoder using 40 cores, RSP achieves the same performance as DMA and 4.8x the performance of shared memory. Along with these performance advantages, RSP requires the least hardware support of the three. RSP's features, performance, and hardware simplicity make it well suited to the embedded processing domain.
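
    The following sketch emulates the flavor of the pattern in ordinary shared memory: the producer performs one-sided writes directly into a buffer the consumer owns, and the consumer only issues loads against its own data. The real mechanism in the paper is a hardware remote store on the TILEPro64; the Channel type and the atomic counter here are assumptions made purely for illustration.

```cpp
// Sketch: a software analogy of the remote-store pattern. The producer writes
// into a buffer allocated by (and local to) the consumer; the consumer reads
// from its own buffer. std::atomic only emulates the one-sided flavor here.
#include <atomic>
#include <cstdio>
#include <functional>
#include <thread>
#include <vector>

struct Channel {
    std::vector<int> buffer;            // owned by the consumer in the RSP model
    std::atomic<int> words_written{0};  // producer advances this after each store
    explicit Channel(int n) : buffer(n) {}
};

void producer(Channel& ch) {
    for (int i = 0; i < static_cast<int>(ch.buffer.size()); ++i) {
        ch.buffer[i] = i * i;                                    // "remote" store into the consumer's buffer
        ch.words_written.store(i + 1, std::memory_order_release);
    }
}

void consumer(Channel& ch) {
    int read = 0, sum = 0;
    while (read < static_cast<int>(ch.buffer.size())) {
        while (ch.words_written.load(std::memory_order_acquire) <= read) { /* spin */ }
        sum += ch.buffer[read++];                                // loads stay local to the consumer
    }
    std::printf("sum = %d\n", sum);
}

int main() {
    Channel ch(1024);                    // allocated on the consumer side
    std::thread c(consumer, std::ref(ch));
    std::thread p(producer, std::ref(ch));
    p.join(); c.join();
}
```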

    Quantifying Wraparound Health Insurance Needs among Employed People with Disabilities

    A presentation about insurance coverage for health care services and supports for people with disabilities who work. “Wrap-around” coverage, or other policy options, may be a viable solution to support employment among people with disabilities. Presentation for the 2015 Academy Health Disability Research Interest Group.

    Managing performance vs. accuracy trade-offs with loop perforation

    Many modern computations (such as video and audio encoders, Monte Carlo simulations, and machine learning algorithms) are designed to trade off accuracy in return for increased performance. To date, such computations typically use ad-hoc, domain-specific techniques developed specifically for the computation at hand. Loop perforation provides a general technique to trade accuracy for performance by transforming loops to execute a subset of their iterations. A criticality testing phase filters out critical loops (whose perforation produces unacceptable behavior) to identify tunable loops (whose perforation produces more efficient and still acceptably accurate computations). A perforation space exploration algorithm perforates combinations of tunable loops to find Pareto-optimal perforation policies. Our results indicate that, for a range of applications, this approach typically delivers performance increases of over a factor of two (and up to a factor of seven) while changing the result that the application produces by less than 10%.
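
    A hand-applied example of the transformation: the perforated variant below executes only every k-th iteration of a simple accumulation loop, trading a bounded accuracy loss for roughly a k-fold reduction in loop work. The function names and the rescaling by the number of visited elements are illustrative; the paper's system applies and validates such transformations automatically.

```cpp
// Sketch: loop perforation applied by hand to a simple accumulation loop.
// The kernel and perforation factor are illustrative only.
#include <cstdio>
#include <vector>

// Original loop: visits every element.
double mean_exact(const std::vector<double>& v) {
    double sum = 0.0;
    for (size_t i = 0; i < v.size(); ++i) sum += v[i];
    return sum / v.size();
}

// Perforated loop: executes only every k-th iteration.
double mean_perforated(const std::vector<double>& v, size_t k) {
    double sum = 0.0;
    size_t visited = 0;
    for (size_t i = 0; i < v.size(); i += k) { sum += v[i]; ++visited; }
    return sum / visited;   // rescale by the iterations actually executed
}

int main() {
    std::vector<double> v(1000000);
    for (size_t i = 0; i < v.size(); ++i) v[i] = (i % 100) * 0.01;
    std::printf("exact      = %f\n", mean_exact(v));
    std::printf("perforated = %f (k = 4)\n", mean_perforated(v, 4));
}
```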

    Quality of service profiling

    Many computations exhibit a trade-off between execution time and quality of service. A video encoder, for example, can often encode frames more quickly if it is given the freedom to produce slightly lower quality video. A developer attempting to optimize such computations must navigate a complex trade-off space to find optimizations that appropriately balance quality of service and performance. We present a new quality of service profiler that is designed to help developers identify promising optimization opportunities in such computations. In contrast to standard profilers, which simply identify time-consuming parts of the computation, a quality of service profiler is designed to identify subcomputations that can be replaced with new (and potentially less accurate) subcomputations that deliver significantly increased performance in return for acceptably small quality of service losses. Our quality of service profiler uses loop perforation (which transforms loops to perform fewer iterations than the original loop) to obtain implementations that occupy different points in the performance/quality of service trade-off space. The rationale is that optimizable computations often contain loops that perform extra iterations, and that removing iterations, then observing the resulting effect on the quality of service, is an effective way to identify such optimizable subcomputations. Our experimental results from applying our implemented quality of service profiler to a challenging set of benchmark applications show that it can enable developers to identify promising optimization opportunities and deliver successful optimizations that substantially increase the performance with only small quality of service losses.
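
    To suggest what such a profile might look like for a single loop, the sketch below sweeps a few perforation factors on a toy kernel and reports speedup against relative error. The kernel, the factors, and the distortion metric are assumptions made for illustration, not the paper's benchmarks or quality-of-service measures.

```cpp
// Sketch: a toy performance/quality-of-service profile for one perforated loop.
// Kernel, perforation factors, and error metric are illustrative assumptions.
#include <chrono>
#include <cmath>
#include <cstdio>
#include <vector>

double mean_perforated(const std::vector<double>& v, size_t k) {
    double sum = 0.0; size_t n = 0;
    for (size_t i = 0; i < v.size(); i += k) { sum += v[i]; ++n; }
    return sum / n;
}

int main() {
    std::vector<double> v(4000000);
    for (size_t i = 0; i < v.size(); ++i) v[i] = 1.0 + std::sin(0.001 * i);

    // Unperforated baseline: reference output and reference running time.
    auto t0 = std::chrono::steady_clock::now();
    double reference = mean_perforated(v, 1);
    double base_ms = std::chrono::duration<double, std::milli>(
        std::chrono::steady_clock::now() - t0).count();

    std::printf("%8s %12s %12s\n", "factor", "speedup", "rel. error");
    for (size_t k : {2, 4, 8, 16}) {
        auto t1 = std::chrono::steady_clock::now();
        double result = mean_perforated(v, k);
        double ms = std::chrono::duration<double, std::milli>(
            std::chrono::steady_clock::now() - t1).count();
        std::printf("%8zu %12.2f %12.4f\n", k, base_ms / ms,
                    std::fabs(result - reference) / std::fabs(reference));
    }
}
```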

    Application Heartbeats for Software Performance and Health

    Adaptive, or self-aware, computing has been proposed as one method to help application programmers confront the growing complexity of multicore software development. However, existing approaches to adaptive systems are largely ad hoc and often do not manage to incorporate the true performance goals of the applications they are designed to support. This paper presents an enabling technology for adaptive computing systems: Application Heartbeats. The Application Heartbeats framework provides a simple, standard programming interface that applications can use to indicate their performance and that system software (and hardware) can use to query an application's performance. Several experiments demonstrate the simplicity and efficacy of the Application Heartbeat approach. First, the PARSEC benchmark suite is instrumented with Application Heartbeats to show the broad applicability of the interface. Then, an adaptive H.264 encoder is developed to show how applications might use Application Heartbeats internally. Next, an external resource scheduler is developed which assigns cores to an application based on its performance as specified with Application Heartbeats. Finally, the adaptive H.264 encoder is used to illustrate how Application Heartbeats can aid fault tolerance.
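
    A minimal sketch of the heartbeat idea, assuming nothing beyond the abstract: the application emits one heartbeat per unit of work, and an observer samples the resulting heart rate that system software could act on. The class and method names below are illustrative and are not the published Application Heartbeats API.

```cpp
// Sketch: a minimal heartbeat-style interface in the spirit of the framework.
// HeartbeatMonitor and its methods are illustrative, not the real API.
#include <atomic>
#include <chrono>
#include <cstdio>
#include <thread>

class HeartbeatMonitor {
    std::atomic<long> beats_{0};
public:
    void heartbeat() { beats_.fetch_add(1, std::memory_order_relaxed); }  // one beat per unit of work
    long beats() const { return beats_.load(std::memory_order_relaxed); }
};

int main() {
    HeartbeatMonitor hb;
    std::atomic<bool> done{false};

    // Application thread: emits one heartbeat per "frame" of work.
    std::thread app([&] {
        for (int frame = 0; frame < 200; ++frame) {
            std::this_thread::sleep_for(std::chrono::milliseconds(5));  // stand-in for encoding a frame
            hb.heartbeat();
        }
        done = true;
    });

    // Observer (system software in the paper's setting): samples the heart
    // rate; an adaptive scheduler could use it to add or remove cores.
    long last = 0;
    while (!done) {
        std::this_thread::sleep_for(std::chrono::milliseconds(250));
        long now = hb.beats();
        std::printf("heart rate: %ld beats in the last 250 ms\n", now - last);
        last = now;
    }
    app.join();
}
```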